Crate os_str_bytes
This crate allows interacting with the data stored by OsStr and OsString, without resorting to panics or corruption for invalid UTF-8. Thus, methods can be used that are already defined on [u8] and Vec<u8>.
Typically, the only way to losslessly construct OsStr or OsString from a byte sequence is to use OsStr::new(str::from_utf8(bytes)?), which requires the bytes to be valid in UTF-8. However, since this crate makes conversions directly between the platform encoding and raw bytes, even some strings invalid in UTF-8 can be converted.
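The standard-library route mentioned above can be sketched as follows. This is a minimal illustration using only std; the helper name os_str_from_utf8 is hypothetical and not part of this crate.

```rust
use std::ffi::OsStr;
use std::str::{self, Utf8Error};

/// Builds an `OsStr` from bytes using only the standard library.
/// Hypothetical helper: it fails whenever `bytes` is not valid UTF-8,
/// which is the limitation described in the paragraph above.
fn os_str_from_utf8(bytes: &[u8]) -> Result<&OsStr, Utf8Error> {
    // `str::from_utf8` validates the bytes; `OsStr::new` then borrows the `str`.
    Ok(OsStr::new(str::from_utf8(bytes)?))
}
```

Calling os_str_from_utf8(b"foo\xFF.txt") returns an error, even though a platform such as Unix may accept those bytes as a file name.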
In most cases, RawOsStr and RawOsString should be used. OsStrBytes and OsStringBytes provide lower-level APIs that are easier to misuse.
Encoding
The encoding of bytes returned or accepted by methods of this crate is intentionally left unspecified. It may vary for different platforms, so defining it would run contrary to the goal of generic string handling. However, the following invariants will always be upheld:
- The encoding will be compatible with UTF-8. In particular, splitting an encoded byte sequence by a UTF-8–encoded character always produces other valid byte sequences. They can be re-encoded without error using RawOsString::into_os_string and similar methods.
- All characters valid in platform strings are representable. OsStr and OsString can always be losslessly reconstructed from extracted bytes.
Note that the chosen encoding may not match how Rust stores these strings internally, which is undocumented. For instance, the result of calling OsStr::len will not necessarily match the number of bytes this crate uses to represent the same string.
Additionally, concatenation may yield unexpected results without a UTF-8 separator. If two platform strings need to be concatenated, the only safe way to do so is using OsString::push. This limitation also makes it undesirable to use the bytes in interchange.
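A minimal sketch of concatenating through OsString::push rather than joining extracted bytes. The function name concat_os is an assumption made for illustration.

```rust
use std::ffi::{OsStr, OsString};

/// Concatenates two platform strings without touching their encoded bytes.
/// Hypothetical helper built on `OsString::push` from the standard library.
fn concat_os(first: &OsStr, second: &OsStr) -> OsString {
    // Copy the first string into an owned buffer, then append the second.
    let mut joined = first.to_os_string();
    joined.push(second);
    joined
}
```

For instance, concat_os(OsStr::new("foo"), OsStr::new(".txt")) yields a platform string equal to "foo.txt".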
Since this encoding can change between versions and platforms, it should not be used for storage. The standard library provides implementations of OsStrExt and OsStringExt for various platforms, which should be preferred for that use case.
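For storage, the standard-library extension traits can be used directly. The sketch below assumes a Unix target and uses std::os::unix::ffi; the function names are hypothetical, and other platforms expose different extension traits.

```rust
#[cfg(unix)]
use std::ffi::{OsStr, OsString};
#[cfg(unix)]
use std::os::unix::ffi::{OsStrExt, OsStringExt};

/// Extracts the bytes of a platform string for storage (Unix only).
#[cfg(unix)]
fn to_stored_bytes(string: &OsStr) -> Vec<u8> {
    // `OsStrExt::as_bytes` borrows the underlying bytes without conversion.
    string.as_bytes().to_vec()
}

/// Rebuilds a platform string from previously stored bytes (Unix only).
#[cfg(unix)]
fn from_stored_bytes(bytes: Vec<u8>) -> OsString {
    // `OsStringExt::from_vec` accepts arbitrary bytes, valid UTF-8 or not.
    OsString::from_vec(bytes)
}
```

On Unix, this round trip preserves byte sequences that are not valid UTF-8.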
User Input
Traits in this crate should ideally not be used to convert byte sequences that did not originate from OsStr or a related struct. The encoding used by this crate is an implementation detail, so it does not make sense to expose it to users.
Crate bstr offers some useful alternative methods, such as ByteSlice::to_os_str and ByteVec::into_os_string, that are meant for user input. However, they reject some byte sequences used to represent valid platform strings, which would be undesirable for reliable path handling. They are best used only when accepting unknown input.
This crate is meant to help when you already have an instance of OsStr
and need to modify the data in a lossless way.
Features
These features are optional and can be enabled or disabled in a “Cargo.toml” file.
Default Features
- memchr - Changes the implementation to use crate memchr for better performance. This feature is useless when “raw_os_str” is disabled. For more information, see RawOsStr.
- raw_os_str - Provides:
Optional Features
- checked_conversions - Provides:
  - EncodingError
  - OsStrBytes::from_raw_bytes
  - OsStringBytes::from_raw_vec
  - RawOsStr::cow_from_raw_bytes
  - RawOsString::from_raw_vec
  Because this feature should not be used in libraries, the “OS_STR_BYTES_CHECKED_CONVERSIONS” environment variable must be defined during compilation.
- conversions - Provides methods that require encoding conversion and may be expensive:
- print_bytes - Provides implementations of print_bytes::ToBytes for RawOsStr and RawOsString.
- uniquote - Provides implementations of uniquote::Quote for RawOsStr and RawOsString.
Nightly Features
These features are unstable, since they rely on unstable Rust features.
- nightly - Changes the implementation to use the “os_str_bytes” nightly feature and provides:
  - RawOsStr::as_encoded_bytes
  - RawOsStr::as_os_str
  - RawOsStr::from_encoded_bytes_unchecked
  - RawOsStr::from_os_str
  - RawOsString::from_encoded_vec_unchecked
  - RawOsString::into_encoded_vec
  - additional trait implementations for RawOsStr and RawOsString
  When applicable, a “Nightly Notes” section will be added to documentation descriptions, indicating differences when this feature is enabled. However, it will not cause any breaking changes.
  This feature will cause memory leaks for some newly deprecated methods. Therefore, it is not recommended to use this feature until the next major version, when those methods will be removed. However, it can be used to prepare for upgrading and determine the impact of the new feature.
  Because this feature should not be used in libraries, the “OS_STR_BYTES_NIGHTLY” environment variable must be defined during compilation.
Implementation
Some methods return Cow to account for platform differences. However, no guarantee is made that the same variant of that enum will always be returned for the same platform. Whichever can be constructed most efficiently will be returned.
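Because the returned variant is not guaranteed, calling code is best written so that it works with either one. The sketch below illustrates this with a stand-in from the standard library, OsStr::to_string_lossy, which also returns Cow; it does not call this crate's methods, and the function names are hypothetical.

```rust
use std::borrow::Cow;
use std::ffi::OsStr;

/// Returns an uppercase copy of `text`, whichever `Cow` variant is received.
fn shout(text: Cow<'_, str>) -> String {
    // `Cow<str>` dereferences to `str`, so the variant is never inspected.
    text.to_uppercase()
}

/// Converts a platform string lossily and passes the `Cow` along unchanged.
fn lossy_shout(string: &OsStr) -> String {
    // `to_string_lossy` may borrow or allocate; `shout` handles both.
    shout(string.to_string_lossy())
}
```

When ownership is required, Cow::into_owned produces an owned value from either variant and only allocates for the borrowed one.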
All traits are sealed, meaning that they can only be implemented by this crate. Otherwise, backward compatibility would be more difficult to maintain for new features.
Complexity
Conversion method complexities will vary based on what functionality is available for the platform. At worst, they will all be linear, but some can take constant time. For example, RawOsString::into_os_string might be able to reuse its allocation.
Examples
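    use std::env;
    use std::fs;
    use std::io;

    use os_str_bytes::RawOsStr;

    fn main() -> io::Result<()> {
        // Treat every argument that does not look like a flag as a file path.
        for file in env::args_os().skip(1) {
            if !RawOsStr::new(&file).starts_with('-') {
                let string = "Hello, world!";
                fs::write(&file, string)?;
                assert_eq!(string, fs::read_to_string(file)?);
            }
        }
        Ok(())
    }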
Modules
- iter (feature raw_os_str): Iterators provided by this crate.
Structs
- EncodingError (feature checked_conversions): The error that occurs when a byte sequence is not representable in the platform encoding.
- RawOsStr (feature raw_os_str): A container for borrowed byte strings converted by this crate.
- RawOsString (feature raw_os_str): A container for owned byte strings converted by this crate.
Traits
- OsStrBytes (feature conversions): A platform agnostic variant of OsStrExt.
- OsStrBytesExt (features nightly and raw_os_str): An extension trait providing methods from RawOsStr.
- OsStringBytes (feature conversions): A platform agnostic variant of OsStringExt.
- Pattern (feature raw_os_str): Allows a type to be used for searching by RawOsStr and RawOsString.
- RawOsStrCow (feature raw_os_str): Extensions to Cow<RawOsStr> for additional conversions.